Contextual language generation by leveraging language understanding

ABSTRACT

Technology is provided for improving digital assistant performance by generating and presenting suggestions to users for completing a task or a session. To generate the suggestions, a machine learned language prediction model is trained with features extracted from multiple sources, such as log data and session context. When input is received from a user, the trained machine learned language prediction model is used to determine the most likely suggestion to present to the user to lead to successful task completion. In generating the suggestion, intermediate suggestion data, such as a domain, intent, and/or slot, is generated for the suggestion. From the generated intermediate suggestion data for the suggestion, a surface form of the suggestion is generated that can be presented to the user. The resulting suggestion and related context may further be used to continue training the machine learned language prediction model.

BACKGROUND

Digital assistant applications that can receive requests to perform tasks for users continue to grow in popularity. Many of these applications are being incorporated into personal computers, laptops, mobile devices, as well as other similar types of devices. While the abilities and types of tasks that a digital assistant is able to perform are substantial, users of digital assistants are often unaware of the total extent of the operations a digital assistant application can perform. Users may not understand how to use the digital assistant application or what they need to say next to get the result they desire.

It is with respect to these and other general considerations that examples have been made. Also, although relatively specific problems have been discussed, it should be understood that the examples should not be limited to solving the specific problems identified in the background.

SUMMARY

The disclosure generally relates to technology for improving digital assistant performance by generating and presenting suggestions to users for completing a task or a session. Because users often do not understand the best way to use the digital assistant or what the most popular commands are, the suggestions provide helpful guidance to the user for completing the user's desired task or continuing in a session with the digital assistant. To generate the suggestions, a machine learned language prediction model is trained with features extracted from multiple sources, such as log data representing prior interactions between users and digital assistants. The model may be trained offline. When input is received from a user, the trained machine learned language prediction model is used to determine the most likely suggestion to present to the user to lead to successful task completion. For instance, the suggestion may be in response to the user requesting information or for the digital assistant to perform other tasks. For example, the machine learned language prediction model may provide a suggestion to the user to perform the most commonly requested subsequent task based on the log data. The suggestion may also be generated in response to the activation of the digital assistant. In generating the suggestion, the machine learned language prediction module may be used to determine intermediate suggestion data, such as a domain, intent, and/or slot, for the suggestion. From the determined intermediate suggestion data for the suggestion, a surface form of the suggestion is generated that can be presented to the user. The surface form of the suggestion is a grammatical, natural language command, phrase, or sentence that the user can understand. The resulting suggestion and related context may further be used to continue training the machine learned language prediction model.

This summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

BRIEF DESCRIPTION OF THE DRAWINGS

Non-limiting and non-exhaustive examples are described with reference to the following Figures.

FIG. 1 depicts an environment for receiving input to a client device.

FIG. 2 depicts a system for utilizing contextual language generation with a digital assistant application.

FIG. 3 illustrates an example of a machine-language-based classifier.

FIG. 4 depicts a method for generating a machine learned language prediction model for use in a digital assistant.

FIG. 5 depicts a method for generating a suggestion in a digital assistant.

FIG. 6 depicts a method for updating a machine learned language prediction model for use in a digital assistant.

FIG. 7 is a block diagram illustrating example physical components of a computing device with which examples of the disclosure may be practiced.

FIGS. 8A and 8B are simplified block diagrams of a mobile computing device with which examples of the present disclosure may be practiced.

FIG. 9 is a simplified block diagram of a distributed computing system in which examples of the present disclosure may be practiced.

FIG. 10 illustrates a tablet computing device for executing one or more examples of the present disclosure.

DETAILED DESCRIPTION

In the following detailed description, references are made to the accompanying drawings that form a part hereof, and in which are shown by way of illustrations specific embodiments or examples. These aspects may be combined, other aspects may be utilized, and structural changes may be made without departing from the spirit or scope of the present disclosure. The following detailed description is therefore not to be taken in a limiting sense, and the scope of the present disclosure is defined by the appended claims and their equivalents.

The present disclosure relates generally to improving the technology behind intelligent digital assistant applications, such as the CORTANA digital assistant application offered by Microsoft Corporation of Redmond, Wash., or the SIRI digital assistant application offered by Apple Incorporated of Cupertino, Calif., hereinafter referred to as “digital assistant”. As digital assistants become more advanced and capable of handling more operations, a gap may form between the product developer's view of what the product can do and what end users think the product can do. Users may not understand the boundaries of the digital assistant, and the users may also not understand the depth of knowledge or operations that the digital assistant can use or perform. These problems are exacerbated if multiple user inputs are required to complete a particular task. For instance, when a user asks a digital assistant a question or to perform a task, the user may not understand what types of follow-up questions or tasks can be performed. As another example, if the digital assistant provides an answer or result that the user was not expecting, the user is often unaware how to correct the problem or rephrase the question in a way that causes the digital assistant to provide the expected answer. In complex tasks, such as searching, browsing, and finding a place according to a number of filters, e.g., distance, rating, opening hours, etc., users can easily get confused about what to say to the digital assistant.

As an example multi-turn scenario of interacting with a digital assistant, a user in a first turn may state “Find Chinese restaurants in Bellevue.” The digital assistant may respond with five Chinese restaurants that are in Bellevue. The user then may be unaware what he or she can ask in a second turn. For instance, some of the possible statements that the user could make include “show me the ones near Lincoln square,” “show me high rated ones,” “show me the ones open on Monday,” “show me the ones that are take-out,” “show me driving directions to the 1st one,” among many others. It would be unreasonable to display each and every option that the user could select, yet the user needs some type of assistance or suggestion as to what options are available.

Other attempts to address these problems have focused on using discrete rules to handle discrete situations. These efforts, however, can only handle a limited number of scenarios because a human application developer must develop each rule for any scenario that the human application developer identifies. Identifying specific scenarios and developing discrete rules is time-intensive and cannot be easily or quickly expanded to cover additional scenarios.

The present technology improves the functionality of digital assistant operations by creating a system for a digital assistant application that is able to provide suggestions to users. The suggestions provide the user with an indication of additional input that the user can provide to the digital assistant. The suggestions may include corrections to input that the user previously provided, or the suggestions may include further inputs that the user can provide to more fully utilize the functionalities available from the digital assistant application, among other possible suggestions. The suggestions are generated, in part, through the use of a machine learned language prediction model that is trained based on data from previous uses of one or more digital assistant applications. By using a model based on rich data sets from users of digital assistant applications, suggestions for a substantial number of scenarios can be generated without the need for creating and parsing discrete rules. In addition, the machine learned language prediction model can also handle a larger number of scenarios than could be handled with discrete rules.

As used herein, a turn is an interaction from a user with a digital assistant application during a session. A session includes a conversation (one or more turns from the user with corresponding responses from the digital assistant) between a user and a digital assistant. For instance, the session may start when the digital assistant is activated and end when the digital assistant is deactivated. The session may also begin when the digital assistant detects a request for a task to be performed. A multi-turn scenario is a session where more than one input is received and processed by the digital assistant.

Turning to the figures, FIG. 1 depicts an environment 100 for providing input to a client device. The environment 100 includes a user 102 and a client device 104. The client device 104 may be any suitable device, such as those devices described below with reference to FIGS. 7-10. The user 102 in environment 100 can provide input into the client device 104 having a digital assistant via voice input, as shown in FIG. 1. The user 102 may also provide input to the client device 104 through textual input by using a soft keyboard or other type of text input device. The user may also provide input to the client device 104 through gestures recognized by, e.g., a touch-screen feature, motion detection, or a camera on the client device. Many options for providing input to the client device 104 are known to those having skill in the art and are contemplated herein. Upon receiving the input from the user 102, the client device 104, executing a digital assistant application, determines what the user 102 has requested and provides an answer to the user's question or performs the requested tasks. Further, as explained below, the digital assistant generates suggestions to the user based on the context of the interactions with the user. In addition, as also described below, performing the requested tasks may also involve the use of other components, such as network-based services and components.

The environment 100 may change as the user 102 accesses a separate client device such as a laptop computer or personal computer. The environment 100 may also change as the user 102 changes location. For instance, as shown in FIG. 1, the client device 104 may be a mobile device such as a smartphone or a tablet computer system that can be transported to different locations by the user, and the user 102 may desire to use the digital assistant application on the client device 104 in multiple environments. The changes in environments may be measured or determined by the client device 104, which can then be used by a digital assistant application as context when generating suggestions for the user 102. For example, a change in the location of the device may be used as input for generating suggestions.

FIG. 2 depicts an example of a system 200 for generating suggestions for a user of a digital assistant. As shown, system 200 includes a conversation and system context module 202, a machine learned language prediction module 204, a log data module 206, a machine-language-based classifier 208, a language generation suggestion module 210, and an input/output module 212. In some examples, each of the components shown in FIG. 2 may be incorporated into a client device, such as client device 104. In other examples, some of the components shown in FIG. 2 may be incorporated into a server that is communicatively connected to a client device having the remainder of the components. In addition, each of the components shown in FIG. 2 may be incorporated into a digital assistant or used in conjunction with a digital assistant. As used herein, “modules” include hardware, software, or a combination of hardware and software configured to perform the processes and functions described herein. In addition, as used herein, an “application” comprises instructions that, when executed by one or more processors, perform the processes and functions described herein.

The log data module 206 accesses data from previous interactions between users and digital assistant applications. The interactions may be specific to a particular user or may be from the interactions of multiple users with multiple digital assistant applications. The log data contains a rich set of data that may include any information that could affect the performance of a task by a digital assistant application. For example, the log data may include information such as device identifications, the location of the device during the interaction, the questions, responses, and inputs during a previous session, items being displayed or otherwise output during the session, contacts referenced, calendar information referenced, and the time of the session, along with any other information about a previous session or the state of the client device during the session. The log data may also include information, such as click-through rates, regarding the user's interactions with the responses provided by the digital assistant application. In addition, the log data may also include all the operations that the user has performed with the client device during a particular time period, such as within a day. In some examples, the data items included in the log data are also considered inputs to a digital assistant when processing a request in a current session. The log data module 206 provides the log data to the machine learned language prediction module 204. In some cases, the log data module 206 also formats the log data into a format that is usable by the machine learned language prediction module 204.
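As a non-limiting illustration of how a log data module might represent and format log data for training, consider the following Python sketch. The field names, the dataclasses, and the pairing of consecutive turns into training examples are assumptions made for illustration and are not drawn from any particular log schema.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LogTurn:
    """One logged turn: the user's input and the assistant's response."""
    query: str                                  # raw user input for this turn
    domain: Optional[str] = None                # domain assigned by the assistant, if logged
    intent: Optional[str] = None                # intent assigned by the assistant, if logged
    slots: dict = field(default_factory=dict)   # slot name -> value
    response: Optional[str] = None              # what the assistant returned
    clicked: bool = False                       # click-through on the response, if tracked


@dataclass
class LogSession:
    """One logged session: device and context metadata plus an ordered list of turns."""
    device_id: str
    location: Optional[str]
    timestamp: str
    turns: list


def format_for_training(sessions):
    """Flatten sessions into (previous turn, next turn) pairs for model training."""
    examples = []
    for session in sessions:
        for previous, following in zip(session.turns, session.turns[1:]):
            examples.append({"previous": previous, "next": following,
                             "session": session})
    return examples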

The machine learned language prediction module 204 develops, based at least in part on the log data received from the log data module 206, a machine learned language prediction model for use in generating suggestions for a user of a digital assistant. The machine learned language prediction model may be developed with a multitude of statistical machine learning based techniques. In some examples, the machine learned language prediction module 204 develops the machine learned language prediction model as an artificial neural network, a Bayesian classifier, a genetically derived algorithm, or any other machine learning model available to those having skill in the art. For instance, the machine learned language prediction model may generate a probability distribution across different suggestions. In examples, the machine learned language prediction model generates a probability distribution across some or all of the domains, intents, and/or slots that the digital assistant is capable of handling.

The machine learned language prediction model is trained with features extracted from sources including, but not limited to, previous user queries (e.g., word/phrase n-gram features) in the session, the previous domain, intent, and slots assigned to the previous queries, as well as session context such as the turn number. The features may also include user-specific history or personalization features. Additional features are also extracted from system responses and results, such as a list of restaurants, movies, people, etc., depending on the user's request in a given domain. All of these features may be defined collectively as the context features. These features are aggregated across all or a subset of the log data and used to train the machine learned language prediction models. Model training is typically done offline, and during runtime the context features are computed and entered into a decoder or module, such as the machine-language-based classifier 208, that uses the trained model to determine the most likely suggestion presented to the user for task completion or a successful session. However, model training may also be done at runtime to continuously update the model. As used herein, a task may include requests for information or queries to the digital assistant.
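The following sketch illustrates one plausible way such context features could be computed; the specific feature names, the use of word unigrams and bigrams, and the result_count feature are illustrative assumptions rather than the exact features described above.

def extract_context_features(previous_query, previous_domain, previous_intent,
                             previous_slots, turn_number, result_count=0):
    """Build a flat feature dictionary from the session context.

    Word unigram/bigram features come from the previous user query; the
    previous domain, intent, and slots and the turn number are added as
    categorical features. (Feature names are illustrative.)
    """
    features = {}
    tokens = previous_query.lower().split()
    for token in tokens:                              # unigram features
        features[f"unigram={token}"] = 1.0
    for first, second in zip(tokens, tokens[1:]):     # bigram features
        features[f"bigram={first}_{second}"] = 1.0
    features[f"prev_domain={previous_domain}"] = 1.0
    features[f"prev_intent={previous_intent}"] = 1.0
    for slot_name in previous_slots:                  # slot presence features
        features[f"prev_slot={slot_name}"] = 1.0
    features["turn_number"] = float(turn_number)
    features["result_count"] = float(result_count)    # size of the system result list
    return features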

In some cases, the machine learned language prediction module 204 provides the generated machine learned language prediction model to the machine-language-based classifier 208. In other cases, the machine learned language prediction module 204 may provide access to the machine-language-based classifier 208 in order to allow the machine-language-based classifier 208 to utilize the trained machine learned language prediction model.

The machine learned language prediction module 204 may also generate and train the predictive model based on data received from the conversation and system context module 202. The conversation and system context module 202 provides user context and session data signals to the machine learned language prediction module 204. For instance, the conversation and system context module 202 provides a user profile or session-specific information that may be useful in generating a predictive model. The user profile may include information about a particular user along with information about the settings on a particular device. The information about the user and the settings of the device both may have an impact on how the digital assistant performs a particular task. The information about the user and the settings of the device also provide additional context into the user's preferences. In some examples, the data and information provided to the machine learned language prediction module 204 from the conversation and system context module 202 may be substantially embodied in the log data received from the log data module 206. In those examples, it may not be necessary for the machine learned language prediction module 204 to receive data from the conversation and system context module 202.

The machine-language-based classifier 208 utilizes the machine learned language prediction model to generate intermediate suggestion data, such as a set of domains, intents, and slots, for generating a suggestion to present to the user. The suggestion may suggest to the user a subsequent task that the digital assistant can complete for the user. The machine-language-based classifier 208 receives at least some of the same types of inputs that were used by the machine learned language prediction module 204 to generate the predictive model. As discussed above, the machine-language-based classifier 208 receives information from the conversation and system context module 202, such as a context for the user. The information received from the conversation and system context module 202 may be the same information provided to the machine learned language prediction module 204. Input received at a client device from a user is also provided to the machine-language-based classifier 208.

In addition, the machine-language-based classifier 208 may also process the input received from the user requesting that the digital assistant perform a task. In processing the user input, the machine-language-based classifier analyzes the user input to determine intermediate task data, such as a domain, intent, and slot, corresponding to the first task. In addition to the context for the user, the intermediate task data corresponding to the first task may also be input to the machine learned language prediction model to generate the intermediate suggestion data for generating the suggestion.

Turning to FIG. 3, in examples, the machine-language-based classifier 208 may have a domain prediction module 302, an intent prediction module 304, and a slot prediction module 306. The domain prediction module 302 predicts a domain to be included as part of a suggestion. A domain may be considered a general category or container within the present framework. For example, there may be a domain for “places” and a domain for “calendar.” With those examples, the places domain may relate to tasks such as getting directions or finding particular places, such as restaurants or stores. The calendar domain may relate to date-based tasks, such as setting reminders or appointments. Based on the inputs to the machine-language-based classifier 208, the domain prediction module 302 predicts a most likely domain for a suggestion to the user. The domain prediction module 302 utilizes at least a component of the machine learned language prediction model generated by the machine learned language prediction module 204. The domain prediction module 302 also may utilize input from the input/output module 212 and the data received from the conversation and system context module 202.

The intent prediction module 304 predicts an intent to be included as part of a suggestion. An intent may be considered an action to be performed. In some examples, within the present framework, each domain may have a multitude of intents relating to the particular domain. For example, within the places domain, there may be an intent for “get_directions.” Based on the inputs to the machine-language-based classifier 208, the intent prediction module 304 predicts a most likely intent for a suggestion to the user. The intent prediction module 304 utilizes at least a component of the machine learned language prediction model generated by the machine learned language prediction module 204. The intent prediction module 304 also may utilize input from the input/output module 212 and the data received from the conversation and system context module 202.

The slot prediction module 306 predicts a slot or slots to be included as part of a suggestion. A slot may be considered a filter or an entity. Slots, however, can be more than entities. A slot may be considered an argument for a function or call that can be used to complete a task or answer a query. For instance, a city, state, or other location may be a slot value for a task of finding directions or finding a store in a particular location. When searching for a restaurant, the type of cuisine, hours of operation, whether the restaurant has a bar, etc. may all be considered slots. Similarly, another slot could be the location of the restaurant or whether the restaurant is near a particular city. The slot prediction module 306 utilizes at least a component of the machine learned language prediction model generated by the machine learned language prediction module 204. The slot prediction module 306 also may utilize input from the input/output module 212 and the data received from the conversation and system context module 202.
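A hedged sketch of how the three prediction modules of FIG. 3 might be composed is shown below; the predict_distribution method on the underlying models is a hypothetical interface assumed for illustration, not an interface defined by the disclosure.

class SuggestionClassifier:
    """Sketch of the classifier of FIG. 3 with separate domain, intent, and
    slot predictors. Each predictor is assumed to expose a hypothetical
    predict_distribution(features) method returning a label-to-probability dict."""

    def __init__(self, domain_model, intent_model, slot_model):
        self.domain_model = domain_model
        self.intent_model = intent_model
        self.slot_model = slot_model

    def predict(self, context_features):
        """Return the most likely (domain, intent, slot) for the next suggestion."""
        return (self._best(self.domain_model, context_features),
                self._best(self.intent_model, context_features),
                self._best(self.slot_model, context_features))

    @staticmethod
    def _best(model, features):
        distribution = model.predict_distribution(features)  # label -> probability
        return max(distribution, key=distribution.get)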

Returning to FIG. 2, the language generation suggestion module 210 receives the intermediate suggestion data for generating the suggestion, such as one or more predicted domains, intents, and slots from the machine-language-based classifier 208. Based on the predicted intermediate suggestion data, the language generation suggestion module 210 generates a suggestion to be presented to the user. In generating the suggestion to be presented to the user, the language generation suggestion module 210 determines a surface form of the suggestion based on the intermediate suggestion data, such as the one or more predicted domains, intents, and slots. A surface form of a suggestion is a form that is recognizable and understandable to the user. For instance, the surface form of a suggestion to get driving directions to a location may be “You can get driving directions to any business, just say it.”
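One plausible, simplified way to realize a surface form from predicted intermediate suggestion data is a template store keyed by domain and intent, as sketched below; the templates and the (domain, intent) keys are invented for illustration.

# Hypothetical (domain, intent) -> template mapping; a real system would hold
# many more templates and may choose among several variants.
SURFACE_TEMPLATES = {
    ("places", "get_directions"): "You can ask for directions to {slot}.",
    ("places", "get_store_hours"): "You can ask for the hours of {slot}.",
    ("weather", "check_weather"): "You can also check the weather there.",
}


def generate_surface_form(domain, intent, slot=None):
    """Render a user-readable suggestion from intermediate suggestion data."""
    template = SURFACE_TEMPLATES.get((domain, intent))
    if template is None:
        return None                                # no surface form available
    if "{slot}" in template and slot is None:
        return None                                # template needs a slot we do not have
    return template.format(slot=slot)


print(generate_surface_form("places", "get_directions", "Microsoft Headquarters"))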

Once the language generation suggestion module 210 generates the surface form of the suggestion, the surface form of the suggestion is provided to the input/output module 212, which presents the suggestion to the user. The suggestion may be presented to the user in any form available on the client device of the user. For example, the suggestion may be presented visually on a screen or audibly through a speaker. The language generation suggestion module 210, in conjunction with the machine-language-based classifier 208, may also utilize the machine learned language prediction model generated by the machine learned language prediction module 204.

As an example, because the machine learned language prediction model is based on previous interactions with users of a digital assistant, the suggestion may be based on the most likely follow-up input provided during past interactions or sessions. For instance, if a user first asks “Find a home improvement store,” the digital assistant application determines the intent of the request to be “find_place.” That intent and the context for the user are provided as input to the machine learned language prediction model, which determines the most likely follow-up input. For example, if 40% of the time, as exposed through use of the machine learned language prediction model, the “find_place” intent is followed by the “get_directions” intent and 20% of the time it is followed by the “get_store_hours” intent, the language generation suggestion module 210 and the machine-language-based classifier 208 dynamically generate suggestions according to these probabilities to guide the user toward additional task completion.
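To make the example concrete, the sketch below shows one possible policy for turning such a follow-up probability distribution into a suggestion choice; weighted random sampling is only one option, and the probabilities shown simply restate the 40%/20% example above.

import random


def choose_follow_up_intent(intent_probabilities, rng=random):
    """Pick a follow-up intent to suggest, weighted by its predicted probability.

    intent_probabilities maps intent name -> probability, e.g. the distribution
    produced by the trained model after a "find_place" request.
    """
    intents = list(intent_probabilities)
    weights = [intent_probabilities[name] for name in intents]
    return rng.choices(intents, weights=weights, k=1)[0]


# Illustrative distribution restating the example in the text.
follow_up = {"get_directions": 0.40, "get_store_hours": 0.20, "other": 0.40}
print(choose_follow_up_intent(follow_up))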

Each of the generated suggestions and interactions with the user may also be included in the log data and reintroduced into the machine learned language prediction module 204 to continuously update the machine learned language prediction model based on the most recent interactions, effectively creating a feedback loop. For instance, as shown in FIG. 2, the language generation suggestion module 210 provides the generated suggestion to the log data module 206. The language generation suggestion module 210 may also provide additional data to the log data module 206, such as the user input and the intermediate suggestion data (e.g., the predicted domains, intents, and slots for the suggestion).

The functionality of the modules depicted in FIG. 2 may also be performed either offline or during runtime. For instance, as shown in FIG. 2, the log data module 206, the machine learned language prediction module 204, and the conversation and system context module 202 may all perform their functions offline. The log data module 206 may gather and organize the log data offline to conserve or best utilize resources. Similarly, the machine learned language prediction module 204 may also generate the machine learned language prediction model offline. The conversation and system context module 202 may also organize some data offline. In some examples, the conversation and system context module 202 operates during runtime to collect information about the present session or conversation between the user and the digital assistant application.

The machine-language-based classifier 208 operates during runtime to utilize the machine learned language prediction model to generate intermediate suggestion data for generating a suggestion, including predicting the domains, intents, and slots for the suggestion based on the inputs discussed above. Similarly, the language generation suggestion module 210 also operates during runtime to generate the surface form of the suggestion. The input/output module 212 also operates during runtime to receive input from the user and to provide output to the user. The machine learned language prediction module 204 may also operate at runtime to continuously update the machine learned language prediction model with additional log data.

FIG. 4 depicts a method 400 for generating a machine learned language prediction model for use in a digital assistant. While the methodologies herein are shown and described as being a series of acts that are performed in a sequence, it is to be understood and appreciated that the methodology is not limited by the order of the sequence. For example, some acts can occur in a different order than what is described herein, as will be appreciated by those skilled in the art. In addition, some acts can occur concurrently with another act. Further, in some instances, not all acts may be required to implement a methodology described herein. Moreover, the acts described herein may be computer-executable instructions that can be implemented by one or more processors and/or stored on a computer-readable medium or media. The computer-executable instructions can include a routine, a sub-routine, programs, a thread of execution, and/or the like. Still further, results of acts of the methodology can be stored in a computer-readable device, displayed on a display device, and/or the like.

At operation 402, log data is received. The log data includes historical data representing previous interactions between one or more users and one or more digital assistant applications, and may specifically include the types of data discussed above. As also discussed above, the log data is used to train a model to reveal patterns of interactions with digital assistant applications. At operation 404, click-through data is received. The click-through data represents selections of a user in previous sessions or interactions. In some examples, the click-through data is incorporated in the log data, and therefore the click-through data need not be separately received in operation 404.

At operation 406, the log data, and the click-through data where included, are analyzed to extract filters from previous queries in the log data. Input provided by users to a digital assistant application may generally be considered requests to complete tasks, such as queries or requests for information. While discussed as queries or tasks herein, the technology is equally applicable to user input requesting any actions or information from the digital assistant, which may be considered generally as requests for tasks. The filters, or in some examples slots, are extracted from the queries to determine the frequent filters provided by users. Extracting the filters from the queries or requests may also assist in revealing the underlying intents of the requests or queries.
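As a rough illustration of filter extraction in operation 406, the sketch below matches logged queries against a small, invented gazetteer of slot phrases; a production system would more likely use a trained slot tagger, so both the phrase lists and the lookup approach are assumptions for illustration only.

from collections import Counter

# Invented phrase lists standing in for a real slot tagger or gazetteer.
SLOT_GAZETTEER = {
    "cuisine": {"chinese", "italian", "thai"},
    "location": {"bellevue", "denver", "seattle"},
    "rating": {"high rated", "top rated"},
}


def extract_filters(query):
    """Return the slot/filter names whose phrases appear in a logged query."""
    text = query.lower()
    return {slot for slot, phrases in SLOT_GAZETTEER.items()
            if any(phrase in text for phrase in phrases)}


def frequent_filters(queries):
    """Count how often each filter appears across the logged queries."""
    counts = Counter()
    for query in queries:
        counts.update(extract_filters(query))
    return counts


print(frequent_filters(["Find Chinese restaurants in Bellevue",
                        "show me high rated ones"]))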

At operation 408, follow-up queries are also extracted from the log data and the click-through data. The follow-up queries are generally only available in multi-turn scenarios where the user has provided multiple inputs. The follow-up queries reveal patterns of the inputs from the user based on the responses that were presented to the user by a digital assistant application. For example, patterns may be revealed that 40% of the time users request directions to a place after searching for that place.
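The follow-up patterns described for operation 408 can be illustrated by counting intent transitions across logged multi-turn sessions, as in the sketch below; it assumes each logged turn already carries an intent label, which is an assumption of this example.

from collections import Counter, defaultdict


def follow_up_distribution(sessions):
    """Estimate P(next intent | current intent) from multi-turn log sessions.

    Each session is assumed to be an ordered list of intent labels, one per turn.
    """
    transitions = defaultdict(Counter)
    for intents in sessions:
        for current, following in zip(intents, intents[1:]):
            transitions[current][following] += 1
    return {current: {following: count / sum(counter.values())
                      for following, count in counter.items()}
            for current, counter in transitions.items()}


sessions = [["find_place", "get_directions"],
            ["find_place", "get_store_hours"],
            ["find_place", "get_directions"]]
print(follow_up_distribution(sessions)["find_place"])
# e.g. {'get_directions': 0.67, 'get_store_hours': 0.33} (approximately)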

At operation 410, a machine learned language prediction model is trained or generated based on at least the log data and the click-through data, along with the extracted filters and follow-up queries, where available. As will be appreciated by those having skill in the art, the machine learned language prediction model may be of many different types that can be trained using data containing inputs and outputs. For instance, the machine learned language prediction model may be a neural network, a statistical regression model, or a support vector machine (SVM). The machine learned language prediction model is used to expose the patterns that exist in the log data and the click-through data. These patterns may often be based on the interactions with the user, such as frequently requested tasks, frequently asked queries, and types of filters frequently used with particular tasks or queries, along with many other patterns and relationships between inputs and outputs that can be revealed by utilizing a machine learned language prediction model.
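The disclosure leaves the model family open (for example, a neural network, regression model, or SVM). As one concrete but non-authoritative possibility, a next-intent predictor could be trained over extracted context features with scikit-learn as sketched below; the toy feature dictionaries and labels are invented for illustration.

from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Toy training data: context features for a turn, paired with the intent of
# the user's next request in the log. Real training would use features
# aggregated from the full log and click-through data.
feature_dicts = [
    {"prev_intent=find_place": 1.0, "turn_number": 1.0},
    {"prev_intent=find_place": 1.0, "turn_number": 1.0},
    {"prev_intent=check_weather": 1.0, "turn_number": 2.0},
]
next_intents = ["get_directions", "get_store_hours", "find_place"]

vectorizer = DictVectorizer()
features = vectorizer.fit_transform(feature_dicts)

model = LogisticRegression(max_iter=1000)
model.fit(features, next_intents)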

The machine learned language prediction model may also be based on intermediate task data, such as the domains, intents, and slots that correspond to the tasks, queries, or other inputs in the log data. The domains, intents, and slots may be explicitly included in the log data or may be extracted during the method 400 to be used in training the machine learned language prediction model. In some examples, multiple predictive models may be trained or generated. For instance, a first predictive model may be generated for domains, a second predictive model may be generated for intents, and a third predictive model may be generated for slots.

FIG. 5 depicts a method 500 for generating a suggestion for a user in a digital assistant. At operation 502, input is received by a digital assistant. In examples where the user is initiating a first turn, the user will provide input requesting the digital assistant to perform a task. In some examples, however, a suggestion may be generated prior to a user providing any requests or queries to the digital assistant application at operation 502. For example, method 500 may be triggered or started upon the user activating the digital assistant prior to the user providing any queries or requests. In such cases, the input received in operation 502 is an input activating the digital assistant. In that example, a suggestion may be presented to the user prior to receiving any further input from the user. In such an example, the suggestion may be generated using the machine learned language prediction model to predict the most likely first input into a digital assistant.

At operation 504, context is received. Context includes a broad amount of data, including data about the current session and context about the user and the client device. For instance, the context may include much of the same data that was used to train the machine learned language prediction model, such as the data in the log data and the click-through data. The context may include a user profile or session-specific information that can be used as input to the machine learned language prediction model to predict intermediate suggestion data, such as the domains, intents, and slots to be used in generating a suggestion. The user profile includes information about a particular user, and in some examples, information about the settings on a particular device. The information about the user and the settings of the device both may have an impact on the suggestion that is generated. The information about the user and the settings of the device also provide additional insight into the user's preferences. Accordingly, different suggestions may be generated based on the context, even where the recent input from the user is the same.

At operation 506, in examples where user input is received, intermediate task data corresponding to the requested task, such as the domain, intent, and slots for the received input, is determined. The domains, intents, and slots assist in classifying and identifying the particular task or query that the user has requested. Based on the determined domains, intents, and slots, the digital assistant application may also perform the requested task or respond to the query at operation 506.

At operation 508, a trained machine learned language prediction model is applied to determine intermediate suggestion data for generating a suggestion for the user. In examples, the intermediate suggestion data includes a domain, intent, and/or slot for a suggestion to be presented to the user. The trained machine learned language prediction model may be a predictive model trained at operation 410 of method 400 or the predictive model generated by the machine learned language prediction module 204 in system 200. The machine learned language prediction model is applied by providing as inputs to the machine learned language prediction model the context for the user identified in operation 504 and the intermediate task data, where available, determined in operation 506. The machine learned language prediction model receives and analyzes the inputs to generate intermediate suggestion data, such as domains, intents, and/or slots, for generating a suggestion. In some examples, only a domain, intent, or slot need be generated by the machine learned language prediction model in operation 508. In other examples, a combination of a domain, intent, and/or slot is generated in operation 508. The trained predictive model determines the most likely domain, intent, and/or slot for a suggestion based on the data from which the model was trained. For instance, if the model determines that the best suggestion for the user does not need to include a slot, the slot may not be determined or generated. As an example, if upon activation of the digital assistant, the predictive model determines that most users ask about the weather, particular slots or filters need not be determined to generate and present a suggestion to the user to ask about the weather.
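Continuing the earlier scikit-learn sketch, operation 508 might then apply the trained model to the current context features as shown below; the vectorizer and model are assumed to be the objects fit in that earlier training sketch, and the feature names are again illustrative.

def rank_suggestion_intents(vectorizer, model, context_features):
    """Rank candidate intents for the next suggestion, most likely first.

    vectorizer and model are assumed to be the DictVectorizer and classifier
    fit in the earlier training sketch (operation 410).
    """
    matrix = vectorizer.transform([context_features])
    probabilities = model.predict_proba(matrix)[0]
    return sorted(zip(model.classes_, probabilities),
                  key=lambda pair: pair[1], reverse=True)


# Example: context where the user's previous request was resolved to "find_place".
ranked = rank_suggestion_intents(
    vectorizer, model, {"prev_intent=find_place": 1.0, "turn_number": 1.0})
best_intent = ranked[0][0]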

Based on the intermediate suggestion data for generating a suggestion, a surface form for the suggestion is generated at operation 510. The surface form of the suggestion is a form that is recognizable and understandable to the user. For instance, if the predicted domain for the suggestion is “places,” the predicted intent for the suggestion is “get_directions,” and the predicted slot for the suggestion is “Microsoft Headquarters,” the user would likely not understand a presentation of that data alone. Rather, it is useful to present that information in a form that the user can understand, i.e., a surface form. In the above example, the surface form may be “You can ask for directions to Microsoft Headquarters.” Such a suggestion may be generated and presented after the user had provided the input of “Find Microsoft Headquarters” and the digital assistant had returned the location of Microsoft Headquarters.

In some examples, multiple suggestions may be generated. For instance, if the machine learned language prediction model determines that an intent for getting directions has the highest probability of being requested next, and a request for getting weather information has the second highest probability, two suggestions may be generated: one for directions and one for weather. Where two suggestions are generated, two surface forms may be generated. In other examples, a combined surface form including both suggestions may be generated.

At operation 512, the surface form of the suggestion generated in operation 510 is output to the user via the client device. The surface form of the suggestion may be presented in a multitude of ways, including visually through a display screen or audibly through a speaker.

Once the user has seen the suggestion, the user will likely follow the suggestion and provide additional input to the digital assistant requesting the digital assistant to perform the suggested task. The additional input from the user is received at operation 514. The additional input from the user is then handled in a similar manner to that of the initial input. For example, operations 504-514, or a subset thereof, may be repeated for the additional input. The context received in operation 504, however, may additionally include the input and response from the previous turn in the session or may have changed in other ways, including the location of the user. In completing operations 504-514 for the additional input, a second suggestion may be presented to the user.

FIG. 6 depicts a method 600 for updating a machine learned language prediction model for use in a digital assistant application. At operation 602, log data is updated by adding the intermediate suggestion data for generating the suggestion, such as the domain, intent, and/or slot that were utilized in generating a suggestion for the user, as generated in method 500. The surface form of the suggestion may also be added to the log data. At operation 604, the current context on which the suggestion was based is also added to a contextual database. In some examples, the context may be added directly to the log data rather than a separate contextual database. At operation 606, the machine learned language prediction model used to determine the suggestion is updated based on the updated log data and the context in the contextual database. Updating the predictive model may be completed by performing method 400 with the updated log data and context data. By updating the machine learned language prediction model based on the generated suggestions and the underlying intermediate suggestion data, including the domains, intents, and/or slots for generating the suggestion, a feedback loop is created that continuously improves the effectiveness of the machine learned language prediction model.
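The feedback loop of method 600 can be sketched schematically as follows; the log structure and the retrain callable (standing in for method 400) are assumptions made for illustration.

def update_feedback_loop(log, surface_form, intermediate_data, context, retrain):
    """Append the latest suggestion and its context to the log, then retrain.

    retrain stands in for method 400 (training the prediction model); it is
    assumed to accept the updated log and return a refreshed model.
    """
    log.append({
        "suggestion_surface_form": surface_form,        # operation 602
        "domain": intermediate_data.get("domain"),
        "intent": intermediate_data.get("intent"),
        "slots": intermediate_data.get("slots"),
        "context": context,                             # operation 604
    })
    return retrain(log)                                  # operation 606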

As an illustration of the methods presented herein, take for example a user that purchases a new device having a digital personal assistant. The user has a goal or task in mind, but the user does not understand what the digital assistant can do or how to properly express the request. For instance, assume the user has just moved to Washington and does not know the area. The user has a goal in mind of getting driving directions to Microsoft headquarters. The user initializes the digital assistant and in a first turn states “Microsoft headquarters.” The digital assistant processes the input and returns the location of Microsoft headquarters. The user often does not know what to say next. Thus, the technology herein is able to predict a suggestion for the user based on previous interactions from the user and other users with digital assistants. The technology predicts a most likely subsequent task, and generates intermediate suggestion data, such as a domain, intent, and/or slot, for a suggestion for the subsequent task to present to the user. Based on the machine learned language prediction model trained on log data, a probability distribution across multiple intents may reveal that a request for a place is most often followed by a request for directions to that place. In such a scenario, a suggestion such as “I can get you directions to Microsoft headquarters, would you like me to do that?” can be presented to the user. The user then understands that the digital assistant is capable of getting directions, and can provide additional input to have the digital assistant get directions to Microsoft headquarters.

As another illustration, a user may not understand the extent of the domains that are supported by a digital assistant or whether the digital assistant is capable of carrying over context from a previous request. For instance, take a user who wants to drive to the Microsoft Store in Denver, Colo. In a first turn, the user states “Find Microsoft Store in Denver.” The digital assistant processes the request and provides a location of the Microsoft Store. The user again may not know what capabilities of the digital assistant are available. Similar to the above illustration, the present technology is capable of presenting a suggestion to the user of follow-up inputs that the user may provide. Depending on probabilities determined by the machine learned language prediction model, the digital assistant may output a suggestion to the user that states “You can also check the weather there.” By presenting this suggestion, the user is now aware that another domain may be accessed and that the context of “Denver” will be carried over to the next request. Multiple suggestions may also be created and presented to the user. For instance, if the machine learned language prediction model determines that users often request the hours for a store after requesting the location of the store, a secondary suggestion may also be presented suggesting to the user that the store hours may be requested. In such cases, the suggestions may be combined into a single surface form such as, “You can also check the weather there or ask for the hours of operation for the Microsoft store.”

FIG. 7 is a block diagram illustrating physical components (e.g., hardware) of a computing device 700 with which examples of the disclosure may be practiced. The computing device components described below may have computer-executable instructions for a speech recognition application 713, e.g., of a client or server, and/or computer-executable instructions for a digital assistant application 711, e.g., of a client or server, that can be executed to employ the methods disclosed herein. Digital assistant 711 may be on a client device such as client device 104. Components of the digital assistant 711 may also be on a server. Similarly, speech recognition application 713 may be on a server. Speech recognition application 713, or components thereof, may also be on a client device, such as client device 104. In a basic configuration, the computing device 700 may include at least one processing unit 702 and a system memory 704. Depending on the configuration and type of computing device, the system memory 704 may comprise, but is not limited to, volatile storage (e.g., random access memory), non-volatile storage (e.g., read-only memory), flash memory, or any combination of such memories. The system memory 704 may include an operating system 705 and one or more program modules 706 suitable for running software applications 720, such as the digital assistants discussed with regard to FIGS. 1-6. The operating system 705, for example, may be suitable for controlling the operation of the computing device 700. Furthermore, examples of the disclosure may be practiced in conjunction with a graphics library, audio library, speech database, speech synthesis applications, other operating systems, or any other application program, and are not limited to any particular application or system. This basic configuration is illustrated in FIG. 7 by those components within a dashed line 708. The computing device 700 may have additional features or functionality. For example, the computing device 700 may also include additional data storage devices (removable and/or non-removable) such as, for example, magnetic disks, optical disks, or tape. Such additional storage is illustrated in FIG. 7 by a removable storage device 709 and a non-removable storage device 710.

As stated above, a number of program modules and data files may be stored in the system memory 704. While executing on the processing unit 702, the program modules 706 (e.g., digital assistant 711 or speech recognition application 713) may perform processes including, but not limited to, the examples as described herein. Other program modules that may be used in accordance with examples of the present disclosure, and in particular to generate screen content and audio content, may include electronic mail and contacts applications, word processing applications, spreadsheet applications, database applications, slide presentation applications, drawing or messaging applications, mapping applications, speech-to-text applications, text-to-speech applications, and/or computer-aided application programs, intelligent assistant applications, etc.

Furthermore, examples of the disclosure may be practiced in an electrical circuit comprising discrete electronic elements, packaged or integrated electronic chips containing logic gates, a circuit utilizing a microprocessor, or on a single chip containing electronic elements or microprocessors. For example, examples of the disclosure may be practiced via a system-on-a-chip (SOC) where each or many of the components illustrated in FIG. 7 may be integrated onto a single integrated circuit. Such an SOC device may include one or more processing units, graphics units, communications units, system virtualization units, and various application functionality, all of which are integrated (or “burned”) onto the chip substrate as a single integrated circuit. When operating via an SOC, the functionality described herein with respect to the capability of a client to switch protocols may be operated via application-specific logic integrated with other components of the computing device 700 on the single integrated circuit (chip). Examples of the disclosure may also be practiced using other technologies capable of performing logical operations such as, for example, AND, OR, and NOT, including but not limited to mechanical, optical, fluidic, and quantum technologies. In addition, examples of the disclosure may be practiced within a general purpose computer or in any other circuits or systems.

The computing device 700 may also have one or more input device(s) 712 such as a keyboard, a mouse, a pen, a sound or voice input device, a touch or swipe input device, etc. Such input devices may be utilized in conjunction with input/output module 212. The output device(s) 714 such as a display, speakers, a printer, etc. may also be included. Such output devices may be utilized in conjunction with input/output module 212. The aforementioned devices are examples and others may be used. The computing device 700 may include one or more communication connections 716 allowing communications with other computing devices 718. Examples of suitable communication connections 716 include, but are not limited to, RF transmitter, receiver, and/or transceiver circuitry; universal serial bus (USB), parallel, and/or serial ports.

The term computer readable media as used herein may include computer storage media. Computer storage media may include volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information, such as computer readable instructions, data structures, or program modules. The system memory 704, the removable storage device 709, and the non-removable storage device 710 are all computer storage media examples (e.g., memory storage). Computer storage media may include RAM, ROM, electrically erasable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other article of manufacture which can be used to store information and which can be accessed by the computing device 700. Any such computer storage media may be part of the computing device 700. Computer storage media does not include a carrier wave or other propagated or modulated data signal. Computer storage media may be stored, incorporated into, or utilized in conjunction with computer storage devices.

Communication media may be embodied by computer readable instructions, data structures, program modules, or other data in a modulated data signal, such as a carrier wave or other transport mechanism, and includes any information delivery media. The term “modulated data signal” may describe a signal that has one or more characteristics set or changed in such a manner as to encode information in the signal. By way of example, and not limitation, communication media may include wired media such as a wired network or direct-wired connection, and wireless media such as acoustic, radio frequency (RF), infrared, and other wireless media.

FIGS. 8A and 8B illustrate a mobile computing device 800, for example, a mobile telephone, a smart phone, a wearable computer (such as a smart watch), a tablet computer, a laptop computer, and the like, with which examples of the disclosure may be practiced. In some examples, the client may be a mobile computing device. With reference to FIG. 8A, one example of a mobile computing device 800 for implementing the examples is illustrated. In a basic configuration, the mobile computing device 800 is a handheld computer having both input elements and output elements. The mobile computing device 800 typically includes a display 805 and one or more input buttons 810 that allow the user to enter information into the mobile computing device 800. The display 805 of the mobile computing device 800 may also function as an input device (e.g., a touch screen display). If included, an optional side input element 815 allows further user input. The side input element 815 may be a rotary switch, a button, or any other type of manual input element. In alternative examples, mobile computing device 800 may incorporate more or fewer input elements. For example, the display 805 may not be a touch screen in some examples. In yet another alternative example, the mobile computing device 800 is a portable phone system, such as a cellular phone. The mobile computing device 800 may also include an optional keypad 835. Optional keypad 835 may be a physical keypad or a “soft” keypad generated on the touch screen display. In various examples, the output elements include the display 805 for showing a graphical user interface (GUI), a visual indicator 820 (e.g., a light emitting diode), and/or an audio transducer 825 (e.g., a speaker). In some examples, the mobile computing device 800 incorporates a vibration transducer for providing the user with tactile feedback. In yet another example, the mobile computing device 800 incorporates input and/or output ports, such as an audio input (e.g., a microphone jack), an audio output (e.g., a headphone jack), and a video output (e.g., an HDMI port) for sending signals to or receiving signals from an external device.

FIG. 8B is a block diagram illustrating the architecture of one example of a mobile computing device. That is, the mobile computing device 800 may incorporate a system (e.g., an architecture) 802 to implement some examples. In one example, the system 802 is implemented as a “smart phone” capable of running one or more applications (e.g., browser, e-mail, calendaring, contact managers, messaging clients, games, text-to-speech applications, and media clients/players). In some examples, the system 802 is integrated as a computing device, such as an integrated personal digital assistant (PDA) and wireless phone.

One or more application programs 866 may be loaded into the memory 862 and run on or in association with the operating system 864. Examples of the application programs include phone dialer programs, e-mail programs, personal information management (PIM) programs, word processing programs, spreadsheet programs, Internet browser programs, messaging programs, text-to-speech applications, and so forth. The system 802 also includes a non-volatile storage area 868 within the memory 862. The non-volatile storage area 868 may be used to store persistent information that should not be lost if the system 802 is powered down. The application programs 866 may use and store information in the non-volatile storage area 868, such as e-mail or other messages used by an e-mail application, and the like. A synchronization application (not shown) also resides on the system 802 and is programmed to interact with a corresponding synchronization application resident on a host computer to keep the information stored in the non-volatile storage area 868 synchronized with corresponding information stored at the host computer. As should be appreciated, other applications may be loaded into the memory 862 and run on the mobile computing device 800, including the instructions to generate suggestions with a digital assistant application.

The system 802 has a power supply 870, which may be implemented as one or more batteries. The power supply 870 might further include an external power source, such as an AC adapter or a powered docking cradle that supplements or recharges the batteries.

The system 802 may also include a radio 872 that performs the function of transmitting and receiving radio frequency communications. The radio 872 facilitates wireless connectivity between the system 802 and the “outside world,” via a communications carrier or service provider. Transmissions to and from the radio 872 are conducted under control of the operating system 864. In other words, communications received by the radio 872 may be disseminated to the application programs 866 via the operating system 864, and vice versa.

The visual indicator 820 may be used to provide visual notifications, and/or an audio interface 874 may be used for producing audible notifications via the audio transducer 825. In the illustrated example, the visual indicator 820 is a light emitting diode (LED) and the audio transducer 825 is a speaker. These devices may be directly coupled to the power supply 870 so that when activated, they remain on for a duration dictated by the notification mechanism even though the processor 860 and other components might shut down for conserving battery power. The LED may be programmed to remain on indefinitely until the user takes action to indicate the powered-on status of the device. The audio interface 874 is used to provide audible signals to and receive audible signals from the user. For example, in addition to being coupled to the audio transducer 825, the audio interface 874 may also be coupled to a microphone to receive audible input, such as to facilitate a telephone conversation or capture speech for speech recognition. In accordance with examples of the present disclosure, the microphone may also serve as an audio sensor to facilitate control of notifications. The system 802 may further include a video interface 876 that enables an operation of an on-board camera 830 to record still images, video streams, and the like.

A mobile computing device 800 implementing the system 802 may have additional features or functionality. For example, the mobile computing device 800 may also include additional data storage devices (removable and/or non-removable) such as magnetic disks, optical disks, or tape. Such additional storage is illustrated in FIG. 8B by the non-volatile storage area 868.

Data/information generated or captured by the mobile computing device 800 and stored via the system 802 may be stored locally on the mobile computing device 800, as described above, or the data may be stored on any number of storage media that may be accessed by the device via the radio 872 or via a wired connection between the mobile computing device 800 and a separate computing device associated with the mobile computing device 800, for example, a server computer in a distributed computing network, such as the Internet. As should be appreciated, such data/information may be accessed via the mobile computing device 800 via the radio 872 or via a distributed computing network. Similarly, such data/information may be readily transferred between computing devices for storage and use according to well-known data/information transfer and storage means, including electronic mail and collaborative data/information sharing systems.

FIG. 9 illustrates one example of the architecture of a system for processing data received at a computing system from a remote source, such as a computing device 904, tablet 906, or mobile device 908, as described above. Content displayed at server device 902 may be stored in different communication channels or other storage types. For example, various documents may be stored using a directory service 922, a web portal 924, a mailbox service 926, an instant messaging store 928, or a social networking site 930. The digital assistant 711 may be employed by a client who communicates with server 902. The server 902 may provide data to and from a client computing device such as a personal computer 904, a tablet computing device 906 and/or a mobile computing device 908 (e.g., a smart phone) through a network 915. Components of the digital assistant 711 may also reside on the server 902. For instance, the machine learned language prediction module 204 and log data module 206 may reside on server 902. In some examples, the machine-language-based classifier 208 and the language generation suggestion module 210 may also reside on server 902. By way of example, the computer system described above may be embodied in a personal computer 904, a tablet computing device 906 and/or a mobile computing device 908 (e.g., a smart phone). Any of these examples of the computing devices may obtain content from the store 916, in addition to receiving graphical data useable to be either pre-processed at a graphic-originating system, or post-processed at a receiving computing system.

FIG. 10 illustrates an exemplary tablet computing device 1000 that may execute one or more examples disclosed herein. Tablet computing device 1000 may comprise a client device 104. In addition, the examples and functionalities described herein may operate over distributed systems (e.g., cloud-based computing systems), where application functionality, memory, data storage and retrieval, and various processing functions may be operated remotely from each other over a distributed computing network, such as the Internet or an intranet. User interfaces and information of various types may be displayed via on-board computing device displays or via remote display units associated with one or more computing devices. For example, user interfaces and information of various types may be displayed and interacted with on a wall surface onto which user interfaces and information of various types are projected. Interaction with the multitude of computing systems with which examples of the invention may be practiced includes keystroke entry, touch screen entry, voice or other audio entry, gesture entry where an associated computing device is equipped with detection (e.g., camera) functionality for capturing and interpreting user gestures for controlling the functionality of the computing device, and the like.

Examples of the present disclosure, for example, are described above with reference to block diagrams and/or operational illustrations of methods, systems, and computer program products according to examples of the disclosure. The functions/acts noted in the blocks may occur out of the order as shown in any flowchart. For example, two blocks shown in succession may in fact be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved.

In addition, to protect the privacy of the user, any aggregation of potentially confidential data of or from a user or resulting from the input of a user may first be anonymized prior to being utilized in the systems and methods disclosed herein. Such anonymization may include the removal of some or all metadata or other data that may connect the results to be utilized to the individual user. The level of desired anonymization may be selected or customized by the user.
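
As a purely illustrative, non-limiting sketch of such an anonymization step, the following Python fragment removes identifying metadata from a log record before the record is used for training. The field names and the two-level scheme are hypothetical illustrations, not part of the disclosed implementation.

    from copy import deepcopy

    # Metadata fields that could connect a record to an individual user,
    # grouped by an assumed user-selectable anonymization level (hypothetical).
    FIELDS_BY_LEVEL = {
        "basic": ["user_id", "device_id"],
        "strict": ["user_id", "device_id", "ip_address", "location", "timestamp"],
    }

    def anonymize(record: dict, level: str = "strict") -> dict:
        """Return a copy of the log record with identifying metadata removed."""
        cleaned = deepcopy(record)
        for field in FIELDS_BY_LEVEL.get(level, []):
            cleaned.pop(field, None)
        return cleaned

    # Example: only the utterance and the derived intent survive strict anonymization.
    entry = {"user_id": "u42", "device_id": "d7", "timestamp": 1700000000,
             "utterance": "play some jazz", "intent": "play_music"}
    print(anonymize(entry))  # {'utterance': 'play some jazz', 'intent': 'play_music'}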

From the foregoing disclosure, it should be appreciated that the disclosure provides for various methods and systems, including a system comprising a machine-language-based classifier. The machine-language-based classifier is configured to: receive a first user input requesting a digital assistant to perform a requested task; identify a first context for the user; analyze the first user input to determine first intermediate task data corresponding to the requested first task; provide the first context and the first intermediate task data as inputs to a machine learned language prediction model, wherein the machine learned language prediction model is trained from log data, the log data comprising historical data representing previous interactions between one or more users and one or more digital assistant applications; and receive, as output from the machine learned language prediction model, first intermediate suggestion data for generating a first suggestion for the user, wherein the first suggestion is for a second task to be requested based on the inputs to the machine learned language prediction model. The system also comprises an output module, wherein the output module is configured to present the first suggestion to the user.
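
As a purely illustrative, non-limiting sketch of this data flow, the following Python fragment stubs out the classifier, the prediction model, and the surface-form generation. Every class, function, and template here is a hypothetical stand-in (the "model" is a simple frequency lookup over log-derived counts), not the disclosed machine learned model; the point of the sketch is the shape of the pipeline: user input and context go in, a domain/intent pair comes out, and only the final step renders that pair as natural language.

    from dataclasses import dataclass, field

    @dataclass
    class IntermediateData:
        """Domain/intent pair (with optional slots) used for both the
        intermediate task data and the intermediate suggestion data."""
        domain: str
        intent: str
        slots: dict = field(default_factory=dict)

    class PredictionModel:
        """Stand-in for the machine learned language prediction model: maps
        (context, intermediate task data) to intermediate suggestion data."""
        def __init__(self, followup_counts):
            # (domain, intent) -> {(next_domain, next_intent): count}, from log data
            self.followup_counts = followup_counts

        def predict(self, context, task):
            followups = self.followup_counts.get((task.domain, task.intent), {})
            best = max(followups, key=followups.get, default=("common", "offer_help"))
            return IntermediateData(domain=best[0], intent=best[1])

    class Classifier:
        """Stand-in for the machine-language-based classifier."""
        def __init__(self, model):
            self.model = model

        def analyze(self, user_input):
            # Toy rule standing in for real language understanding.
            if "weather" in user_input:
                return IntermediateData("weather", "get_forecast", {"when": "today"})
            return IntermediateData("common", "unknown")

        def suggest(self, user_input, context):
            task = self.analyze(user_input)           # first intermediate task data
            return self.model.predict(context, task)  # first intermediate suggestion data

    def surface_form(data):
        """Render intermediate suggestion data as a natural language suggestion."""
        templates = {
            ("calendar", "add_event"): "Would you like me to add something to your calendar?",
        }
        return templates.get((data.domain, data.intent), "Is there anything else I can do?")

    # Usage: in the logs, a weather request is most often followed by a calendar request.
    counts = {("weather", "get_forecast"): {("calendar", "add_event"): 120,
                                            ("common", "offer_help"): 3}}
    classifier = Classifier(PredictionModel(counts))
    print(surface_form(classifier.suggest("what's the weather", context={"locale": "en-US"})))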

In addition, one having skill in the art will also appreciate the improvements to the functionality of the computing system from the systems and methods disclosed herein. For instance, by providing suggestions to the users for the most likely next task to be completed, computing resources of the device and the digital assistant are saved by not having to process repeated imperfect requests from the user. As discussed above, in the past, users would continuously provide input into a digital assistant without understanding what the optimal inputs were to achieve their desired task. With the technology herein, such inefficiencies can be lessened or eliminated. In addition, the device can also save resources by not having to process individual discrete rules programmed by human programmers.

The description and illustration of one or more examples provided in this application are not intended to limit or restrict the scope of the disclosure as claimed in any way. The examples and details provided in this application are considered sufficient to convey possession and enable others to make and use the best mode of the claimed disclosure. Further, the terms “exemplary” and “illustrative” are meant only to be indicative of examples, and not to designate one example as necessarily being more useful or beneficial over any other example. The claimed disclosure should not be construed as being limited to any embodiment, example, or detail provided in this application. Regardless of whether shown and described in combination or separately, the various features (both structural and methodological) are intended to be selectively included or omitted to produce an embodiment with a particular set of features. Having been provided with the description and illustration of the present application, one skilled in the art may envision variations, modifications, and alternate embodiments falling within the spirit of the broader aspects of the general inventive concept embodied in this application that do not depart from the broader scope of the claimed disclosure.

1. A system comprising: a machine-language-based classifier, wherein the machine-language-based classifier is configured to perform the following actions: receive a first user input requesting a digital assistant to perform a requested task; identify a first context for the user; analyze the first user input to determine first intermediate task data corresponding to the requested first task; provide the first context and the first intermediate task data as inputs to a machine learned language prediction model, wherein the machine learned language prediction model is based on log data, the log data comprising historical data representing previous interactions between one or more users and one or more digital assistant applications; receive, as output from the machine learned language prediction model, first intermediate suggestion data for generating a first suggestion for the user, wherein the first suggestion is for a second task to be requested based on the inputs to the machine learned language prediction model; and an output module, wherein the output module is configured to present the first suggestion to the user.
2. The system of claim 1, wherein the first intermediate suggestion data includes at least one data type selected from the group consisting of a domain, an intent, and a slot for generating the first suggestion to the user.
3. The system of claim 1, further comprising a language generation suggestion module configured to generate a surface form of the first suggestion based on the first intermediate suggestion data.
4. The system of claim 1, further comprising a log data module configured to add the first intermediate suggestion data, the first suggestion, and the first context to the log data.
5. The system of claim 1, further comprising: a machine learned language prediction module, wherein the machine learned language prediction module is configured to perform the following actions: receive updated log data; and train the machine learned language prediction model with the updated log data to generate an updated machine learned language prediction model.
6. The system of claim 5, wherein the machine-language-based classifier is further configured to perform the following actions: receive a second user input, wherein the second user input is the same as the first user input; receive a second context for the user; analyze the second user input to determine second intermediate task data corresponding to the requested task; provide the second context and the second intermediate task data as inputs to the updated machine learned language prediction model; receive, as output from the updated machine learned language prediction model, second intermediate suggestion data for generating a second suggestion for the user, wherein the second suggestion is different from the first suggestion; and wherein the output module is further configured to present the second suggestion to the user.
7. The system of claim 1, wherein the machine-language-based classifier is further configured to perform the following actions: receive a second user input, wherein the second user input is the same as the first user input; receive a second context for the user, wherein the second context is different from the first context; analyze the second user input to determine second intermediate task data corresponding to the requested task; provide the second context and the second intermediate task data as inputs to the machine learned language prediction model; receive, as output from the machine learned language prediction model, second intermediate suggestion data for generating a second suggestion for the user, wherein the second suggestion is different from the first suggestion; and wherein the output module is further configured to present the second suggestion to the user.
8. The system of claim 2, wherein the machine learned language prediction model includes a predictive model for domains, a predictive model for intents, and a predictive model for slots.
9. A computer-implemented method comprising: receiving a first user input requesting a digital assistant to perform a requested task; identifying a first context for the user; analyzing the first user input to determine first intermediate task data corresponding to the requested first task; providing the first context and the first intermediate task data as inputs to a machine learned language prediction model, wherein the machine learned language prediction model is trained from log data, the log data comprising historical data representing previous interactions between one or more users and one or more digital assistant applications; receiving, as output from the machine learned language prediction model, first intermediate suggestion data for generating a first suggestion for the user, wherein the first suggestion is for a second task to be requested based on the inputs to the machine learned language prediction model; and presenting the first suggestion to the user.
10. The method of claim 9, wherein the first intermediate suggestion data includes at least one data type selected from the group consisting of: a domain, an intent, and a slot for generating the first suggestion to the user.
11. The method of claim 9, further comprising generating a surface form of the first suggestion based on the first intermediate suggestion data.
12. The method of claim 9, further comprising: receiving updated log data; and training the machine learned language prediction model while offline with the updated log data to generate an updated machine learned language prediction model.
13. The method of claim 9, further comprising: receiving updated log data; and training the machine learned language prediction model at runtime with the updated log data to generate an updated machine learned language prediction model.
14. The method of claim 13, further comprising: receiving a second user input, wherein the second user input is the same as the first user input; receiving a second context for the user; analyzing the second user input to determine second intermediate task data corresponding to the requested task; providing the second context and the second intermediate task data as inputs to the updated machine learned language prediction model; receiving, as output from the updated machine learned language prediction model, second intermediate suggestion data for generating a second suggestion for the user, wherein the second suggestion is different from the first suggestion; and presenting the second suggestion to the user.
15. The method of claim 9, further comprising: receiving a second user input, wherein the second user input is the same as the first user input; receiving a second context for the user, wherein the second context is different from the first context; analyzing the second user input to determine second intermediate task data corresponding to the requested task; providing the second context and the second intermediate task data as inputs to the machine learned language prediction model; receiving, as output from the machine learned language prediction model, second intermediate suggestion data for generating a second suggestion for the user, wherein the second suggestion is different from the first suggestion; and presenting the second suggestion to the user.
16. The method of claim 9, wherein the machine learned language prediction model includes a predictive model for domains, a predictive model for intents, and a predictive model for slots.
17. A system comprising: a machine learned language prediction module, wherein the machine learned language prediction module is configured to: receive log data from a log data module; train a machine learned language prediction model based on the log data comprising historical data representing previous interactions between one or more users and one or more digital assistant applications; and provide a machine-language-based classifier access to the machine learned language prediction model, such that the machine-language-based classifier can provide inputs to the machine learned language prediction model in the form of user input and context for the user and receive outputs from the machine learned language prediction model in the form of intermediate suggestion data; and a log data module configured to: receive log data from interactions between multiple users and digital assistants; and provide the log data to the machine learned language prediction module.
18. The system of claim 17, wherein the intermediate suggestion data includes at least one data type selected from the group consisting of a domain, an intent, and a slot for generating a suggestion to the user.
19. The system of claim 17, wherein the machine learned language prediction module is further configured to receive updated log data and update the machine learned language prediction model by training the machine learned language prediction model with the updated log data.
20. The system of claim 17, wherein the context for the user includes a user profile and device settings.
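
As a final, purely illustrative sketch, the train-then-update cycle recited in claims 17 and 19 above might be approximated with a simple count-based model as follows. The module and method names, and the count-based model itself, are hypothetical stand-ins; the sketch shows only that retraining on updated log data can change the suggestion that is ultimately produced.

    from collections import Counter, defaultdict

    class LogDataModule:
        """Collects (current task, next task) pairs from user/assistant sessions."""
        def __init__(self):
            self.pairs = []

        def record(self, current_task, next_task):
            self.pairs.append((current_task, next_task))

    class PredictionModule:
        """Trains, and later retrains, the prediction model from log data."""
        def __init__(self):
            self.model = defaultdict(Counter)  # current task -> follow-up task counts

        def train(self, log_pairs):
            for current_task, next_task in log_pairs:
                self.model[current_task][next_task] += 1

        def predict(self, current_task):
            followups = self.model.get(current_task)
            return followups.most_common(1)[0][0] if followups else None

    logs = LogDataModule()
    for next_task in ("add_event", "add_event", "set_reminder"):
        logs.record("get_forecast", next_task)

    module = PredictionModule()
    module.train(logs.pairs)               # initial (e.g., offline) training
    print(module.predict("get_forecast"))  # add_event

    logs.record("get_forecast", "set_reminder")
    logs.record("get_forecast", "set_reminder")
    module.train(logs.pairs[-2:])          # update with the updated log data
    print(module.predict("get_forecast"))  # set_reminder: the suggestion tracks the logs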